A "good" model, trained on relevant data, assigns high probability to a logical sentence, resulting in low perplexity. A "bad" model, trained on poor or insufficient data, is more "surprised" by the same sentence, leading to high perplexity.